Multivariate distributions are explored using the joint distributions of marginal sample quantiles. 
Limit theory for the mean of a function of order statistics is presented. The results include 
a multivariate central limit theorem and a strong law of large numbers. A result similar to 
Bahadur's representation of quantiles is established for the mean of a function of the marginal 
quantiles. In particular, it is shown that 

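$$\sqrt{n}\Biggl(\frac{1}{n}\sum_{i=1}^{n}\phi\bigl(X_{n:i}^{(1)},\ldots,X_{n:i}^{(d)}\bigr)-\bar\gamma\Biggr)=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}Z_{n,i}+o_P(1)$$

as $n\to\infty$, where $\bar\gamma$ is a constant and $Z_{n,i}$ are i.i.d. random variables for each $n$. This leads 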
to the central limit theorem. Weak convergence to a Gaussian process using equicontinuity of 
functions is indicated. The results are established under very general conditions. These conditions 
are shown to be satisfied in many commonly occurring situations. 

Keywords: central limit theorem; Cramér-Wold device; lost association; quantiles; strong law 
of large numbers; weak convergence of a process 

1. Introduction 

Let $\{(X_i^{(1)}, X_i^{(2)}, \ldots, X_i^{(d)}),\ i = 1, 2, \ldots\}$ be a sequence of random vectors such that for 
each $j$ ($1 \le j \le d$), $\{X_1^{(j)}, X_2^{(j)}, \ldots\}$ forms a sequence of independent and identically 
distributed (i.i.d.) random variables. For $1 \le j, k \le d$, let $F_j$ and $F_{j,k}$ denote the 
distributions of $X_1^{(j)}$ and $(X_1^{(j)}, X_1^{(k)})$, respectively. Let $X_{n:i}^{(j)}$ denote the $i$th order 
statistic ($\frac{i}{n}$th quantile) of $\{X_1^{(j)}, X_2^{(j)}, \ldots, X_n^{(j)}\}$. The vector 
$(X_{n:i}^{(1)}, \ldots, X_{n:i}^{(d)})$ corresponds to the $i$th marginal order statistics. In this article, we 
study the asymptotic behavior of the mean of a function of marginal sample quantiles: 

$$\frac{1}{n}\sum_{i=1}^{n}\phi\bigl(X_{n:i}^{(1)},\ldots,X_{n:i}^{(d)}\bigr) \tag{1.1}$$

as $n\to\infty$, where $\phi : \mathbb{R}^d \to \mathbb{R}$ satisfies some mild conditions. 
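
As a concrete illustration of the statistic in (1.1), the following minimal sketch sorts each coordinate separately and averages $\phi$ over the resulting marginal order statistics. The function name `mean_of_marginal_quantiles` and the toy data are illustrative assumptions and do not come from the paper.

```python
def mean_of_marginal_quantiles(columns, phi):
    """Average phi over the i-th marginal order statistics, i = 1..n.

    columns: list of d equal-length sequences, one per coordinate.
    phi: function of d real arguments.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all coordinates must have the same sample size")
    # Sort every coordinate on its own; this discards the original pairing.
    sorted_columns = [sorted(col) for col in columns]
    # zip(*...) yields (X_{n:i}^{(1)}, ..., X_{n:i}^{(d)}) for i = 1..n.
    return sum(phi(*row) for row in zip(*sorted_columns)) / n


# Toy data with d = 2 and n = 3 (hypothetical values).
xs = [3.0, 1.0, 2.0]
ys = [10.0, 30.0, 20.0]
value = mean_of_marginal_quantiles([xs, ys], lambda x, y: x * y)
print(value)  # equals (1*10 + 2*20 + 3*30) / 3
```

Because each coordinate is sorted independently, the result depends on the data only through the marginal order statistics, which is why the statistic remains computable when the pairing between coordinates is lost.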

Our results, Theorems 1.1 and 1.2 stated below, were motivated in part by one of the 
authors considering [10] the problem of estimating the parameters in a linear regression 
model, $Y = \alpha + \beta X + \varepsilon$, when the linkage between the variables $X$ and $Y$ was either 
partially or completely lost. Were the linkage not lost, then the least-squares estimator 
for $\beta$ would be given by $\bigl(\sum_{i=1}^n X_iY_i - n\bar X_n\bar Y_n\bigr)/\sum_{i=1}^n (X_i - \bar X_n)^2$, where $\bar X_n$ and $\bar Y_n$ denote 
the sample means of $(X_1, \ldots, X_n)$ and $(Y_1, \ldots, Y_n)$. When the linkage is lost, a natural 
candidate to estimate $\beta$ is the average of this expression over all possible permutations 
of the $Y_i$'s. As the term in the denominator and the second term in the numerator are 
permutation invariant, it remains to consider the average of $\frac{1}{n}\sum_{i=1}^n X_iY_{\pi(i)}$ over the 
permutations $\pi \in S_n$. This expression is bounded above by $\frac{1}{n}\sum_{i=1}^n X_{n:i}Y_{n:i}$ and below by 
$\frac{1}{n}\sum_{i=1}^n X_{n:i}Y_{n:n-i+1}$, by the well-known rearrangement inequality of Hardy-Littlewood-Pólya 
(see [8], Chapter 10). The asymptotic behavior of the lower bound can be deduced from that of 
the upper bound. The upper bound, $\frac{1}{n}\sum_{i=1}^n X_{n:i}Y_{n:i}$, is a special case of (1.1). The problem of the loss of 
association among paired data has attracted a lot of attention in various contexts, such 
as the broken sample problem, file linkage problem and record linkage (see, e.g., [2, 4, 7]). 
See item (3) in Section 4 for further results and a very brief review of the literature. 
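
The rearrangement bounds used in this motivation can be checked by brute force on a small sample. The sketch below is only an illustration under assumed toy data (six Gaussian draws with a fixed seed); the helper names `normalized_pairing` and `rearrangement_bounds` are hypothetical.

```python
import itertools
import random


def normalized_pairing(xs, ys):
    """Return (1/n) * sum_i xs[i] * ys[i]."""
    return sum(x * y for x, y in zip(xs, ys)) / len(xs)


def rearrangement_bounds(xs, ys):
    """Lower and upper bounds given by the rearrangement inequality."""
    xs_sorted = sorted(xs)
    ys_sorted = sorted(ys)
    upper = normalized_pairing(xs_sorted, ys_sorted)        # similarly ordered
    lower = normalized_pairing(xs_sorted, ys_sorted[::-1])  # oppositely ordered
    return lower, upper


rng = random.Random(0)  # fixed seed, for reproducibility only
xs = [rng.gauss(0.0, 1.0) for _ in range(6)]
ys = [rng.gauss(0.0, 1.0) for _ in range(6)]

lower, upper = rearrangement_bounds(xs, ys)
# Brute force over all 6! = 720 pairings of the Y values with the X values.
values = [normalized_pairing(xs, perm) for perm in itertools.permutations(ys)]
print(lower, min(values), max(values), upper)
```

Up to floating-point rounding, the smallest and largest pairings coincide with the two bounds, and the average over all pairings equals the product of the two sample means, which is the permutation-invariant quantity mentioned in the text.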

We shall first introduce some notation. We shall reserve $\{U_i\}$ for a sequence of 
independent random variables distributed uniformly on $(0,1)$. Let $U_{n:i}$ be the $i$th order 
statistic of $(U_1, \ldots, U_n)$. For a probability distribution function $F$ and $0 < t < 1$, define 
$F^{-1}(t) = \inf\{x : F(x) \ge t\}$. 
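
For the empirical distribution function of a finite sample, this generalized inverse reduces to picking an order statistic. A minimal sketch follows; the function name `empirical_quantile` is an assumption made here for illustration.

```python
import math


def empirical_quantile(sample, t):
    """Generalized inverse inf{x : F_n(x) >= t} of the empirical CDF F_n."""
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between 0 and 1")
    ordered = sorted(sample)
    n = len(ordered)
    # F_n(x) >= t first holds at the k-th order statistic, k = ceil(n * t).
    k = math.ceil(n * t)
    return ordered[k - 1]


data = [3.0, 1.0, 2.0, 5.0]
print(empirical_quantile(data, 0.5))   # F_n(2.0) = 2/4 >= 0.5, so 2.0
print(empirical_quantile(data, 0.51))  # three points are needed, so 3.0
```

In particular the value at $t = i/n$ is the $i$th order statistic. The product `n * t` is computed in floating point, so values of `t` that are not exactly representable may need exact arithmetic in careful use.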

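Let $\phi$ be a real-valued measurable function on $\mathbb{R}^d$. For $0 < x, x_1, \ldots, x_d < 1$, 
$\mathbf{x} = (x_1, \ldots, x_d)$, and $1 \le j, k \le d$, define 

$$\psi(\mathbf{x}) = \phi\bigl(F_1^{-1}(x_1),\ldots,F_d^{-1}(x_d)\bigr), \tag{1.2}$$

$$\gamma(x) = \psi(x,x,\ldots,x), \tag{1.3}$$

$$\psi_j(x) = \frac{\partial\psi(\mathbf{x})}{\partial x_j}\bigg|_{\mathbf{x}=(x,\ldots,x)}, \tag{1.4}$$

$$\psi_{j,k}(\mathbf{x}) = \frac{\partial^2\psi(\mathbf{x})}{\partial x_j\,\partial x_k}, \tag{1.5}$$

$$\tilde\psi_{j,k}(x) = \psi_{j,k}(x,\ldots,x). \tag{1.6}$$
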
We shall now introduce conditions on $\phi$ that are used in the results: 

(C1) The function $\psi(u_1, \ldots, u_d)$ is continuous at $u_1 = \cdots = u_d = u$, $0 < u < 1$. That is, 
$\psi$ is continuous at each point on the diagonal of $(0,1)^d$. The function $\psi$ need not 
be bounded. 

(C2) There exist $K$ and $c_0 > 0$ such that 

$$|\psi(x_1,\ldots,x_d)| \le K\Bigl(1+\sum_{j=1}^{d}|\gamma(x_j)|\Bigr) \quad\text{for } (x_1,\ldots,x_d)\in(0,c_0)^d\cup(1-c_0,1)^d.$$

(C3) Let $\mu_{n:i} = i/(n+1)$. For $1 \le j, k \le d$, 

$$\frac{1}{n}\sum_{i=1}^{n}\bigl(\mu_{n:i}(1-\mu_{n:i})\bigr)^{3/2}\bigl(\psi_j(\mu_{n:i})\bigr)^2 \to \int_0^1 (x(1-x))^{3/2}(\psi_j(x))^2\,dx < \infty$$

and 

$$\frac{1}{n}\sum_{i=1}^{n}\bigl(\mu_{n:i}(1-\mu_{n:i})\bigr)^{3/2}\bigl|\tilde\psi_{j,k}(\mu_{n:i})\bigr| \to \int_0^1 (x(1-x))^{3/2}|\tilde\psi_{j,k}(x)|\,dx < \infty.$$

(C4) For all large $m$, there exist $K = K(m) \ge 1$ and $\delta > 0$ such that 

$$|\psi(\mathbf{y})-\psi(\mathbf{x})-\langle \mathbf{y}-\mathbf{x},\nabla\psi(\mathbf{x})\rangle| \le K\sum_{j,k=1}^{d}|(y_j-x)(y_k-x)|\bigl(1+|\tilde\psi_{j,k}(x)|\bigr),$$

whenever $\|\mathbf{y}-\mathbf{x}\|_{\ell_1} < \delta$ and $\min_{1\le j\le d} y_j(1-y_j) > x(1-x)/m$, where $\mathbf{x} = (x,\ldots,x)$, 
$\mathbf{y} = (y_1,\ldots,y_d) \in (0,1)^d$. Here, $\|\mathbf{y}\|_{\ell_1} := |y_1| + \cdots + |y_d|$ denotes the 
$\ell_1$-norm of $\mathbf{y}$ and $\nabla\psi(\mathbf{x})$ denotes the gradient of $\psi$. 

Condition (C3) holds if the functions $(x(1-x))^{3/2}(\psi_j(x))^2$ and $(x(1-x))^{3/2}|\tilde\psi_{j,k}(x)|$ 
are Riemann integrable over $(0,1)$ and satisfy $K$-pseudo convexity for $1 \le j, k \le d$. A 
function $g$ is said to be $K$-pseudo convex if $g(\lambda x + (1-\lambda)y) \le K(\lambda g(x) + (1-\lambda)g(y))$. 

To state the main results, recall the definition of $\gamma$ in (1.3). 

Theorem 1.1. Let $\{(X_i^{(1)}, X_i^{(2)}, \ldots, X_i^{(d)}),\ i = 1, 2, \ldots\}$ be a sequence of random vectors 
such that for each $j$ ($1 \le j \le d$), $\{X_1^{(j)}, X_2^{(j)}, \ldots\}$ forms a sequence of i.i.d. random 
variables. Suppose $\phi$ satisfies conditions (C1)-(C2), $F_j$ is continuous for $1 \le j \le d$ and $\gamma$ is 
Riemann integrable. Then, 

$$\frac{1}{n}\sum_{i=1}^{n}\phi\bigl(X_{n:i}^{(1)},\ldots,X_{n:i}^{(d)}\bigr) \xrightarrow{\text{a.s.}} \bar\gamma$$

as $n\to\infty$, where $\bar\gamma = \int_0^1 \gamma(y)\,dy$. 

Note that we need only the independence of the $j$th marginal random variables, for 
each $j$. The result does not depend on the joint distribution of $(X_i^{(1)}, \ldots, X_i^{(d)})$. 
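
The following simulation sketch illustrates this insensitivity to the joint distribution. It assumes $d = 2$, uniform marginals on $(0,1)$ and $\phi(x,y) = xy$, so that $\gamma(u) = u^2$ and $\bar\gamma = 1/3$; the three dependence structures, the sample size and the seed are arbitrary choices made here for illustration.

```python
import random


def mean_of_marginal_quantiles(xs, ys, phi):
    """(1/n) * sum_i phi(X_{n:i}, Y_{n:i}) for a bivariate sample."""
    return sum(phi(x, y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)


def phi(x, y):
    return x * y  # with uniform marginals, gamma(u) = u**2 and gamma-bar = 1/3


rng = random.Random(12345)  # fixed seed, for reproducibility only
n = 20000
u = [rng.random() for _ in range(n)]

independent = [rng.random() for _ in range(n)]  # Y independent of X
comonotone = list(u)                            # Y = X
countermonotone = [1.0 - x for x in u]          # Y = 1 - X

results = {
    "independent": mean_of_marginal_quantiles(u, independent, phi),
    "comonotone": mean_of_marginal_quantiles(u, comonotone, phi),
    "countermonotone": mean_of_marginal_quantiles(u, countermonotone, phi),
}
for name, value in results.items():
    print(name, round(value, 4))
```

All three values should be close to $1/3$, in line with the remark that only the marginal distributions enter the limit.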
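Theorem 1.2. Let $X_i = (X_i^{(1)}, \ldots, X_i^{(d)})$ be i.i.d. random vectors. Suppose $\phi$ satisfies 
conditions (C1)-(C4), $F_j$ is continuous for $1 \le j \le d$ and $\gamma$ is Riemann integrable. Then, 

$$\frac{1}{\sqrt n}\sum_{i=1}^{n}\phi\bigl(X_{n:i}^{(1)},\ldots,X_{n:i}^{(d)}\bigr)-\sqrt n\,\bar\gamma = \frac{1}{\sqrt n}\sum_{\ell=1}^{n}Z_{n,\ell}+o_P(1), \tag{1.7}$$

where $Z_{n,\ell} = \frac{1}{n}\sum_{j=1}^{d}\sum_{i=1}^{n} W_{j,\ell}(i/n)\,\psi_j(i/(n+1))$, $W_{j,\ell}(x) = I(U_\ell^{(j)} \le x) - x$ with 
$U_\ell^{(j)} = F_j(X_\ell^{(j)})$ for $1 \le \ell \le n$, and $\bar\gamma$ is defined as in Theorem 1.1. Further, as $n\to\infty$, 

$$\frac{1}{\sqrt n}\sum_{i=1}^{n}\phi\bigl(X_{n:i}^{(1)},\ldots,X_{n:i}^{(d)}\bigr)-\sqrt n\,\bar\gamma \xrightarrow{\text{dist}} N(0,\sigma^2), \tag{1.8}$$

where $G_{j,k}(x,y) = F_{j,k}(F_j^{-1}(x), F_k^{-1}(y))$ and 

$$\sigma^2 = \lim_{n\to\infty}\operatorname{Var}(Z_{n,1}) = 2\sum_{j=1}^{d}\int_0^1\!\!\int_0^y x(1-y)\,\psi_j(x)\psi_j(y)\,dx\,dy + 2\sum_{1\le j<k\le d}\int_0^1\!\!\int_0^1 \bigl(G_{j,k}(x,y)-xy\bigr)\psi_j(x)\psi_k(y)\,dx\,dy.$$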



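This theorem can be extended to $m$ functions $\phi_1, \ldots, \phi_m$ simultaneously using the 
Cramér-Wold device (see [3]), as in the corollary below. Let $\psi_j(x;r)$ denote the partial 
derivative of $\phi_r(F_1^{-1}(x_1), \ldots, F_d^{-1}(x_d))$ with respect to $x_j$ evaluated at $x_1 = \cdots = x_d = x$. 

Corollary 1.1. Let $\phi_1, \ldots, \phi_m$ satisfy conditions (C1)-(C4). For $1 \le r \le m$, if we define 
$T_n(\phi_r) = \sum_{i=1}^{n}\phi_r(X_{n:i}^{(1)}, \ldots, X_{n:i}^{(d)})$ and $\bar\gamma_r = E\phi_r(F_1^{-1}(U), F_2^{-1}(U), \ldots, F_d^{-1}(U))$, then 

$$\frac{1}{\sqrt n}\bigl(T_n(\phi_1),\ldots,T_n(\phi_m)\bigr)-\sqrt n\,(\bar\gamma_1,\ldots,\bar\gamma_m) \xrightarrow{\text{dist}} N(0,\Sigma) \quad\text{as } n\to\infty,$$

where the $(r,s)$th element $\sigma_{r,s}$ of $\Sigma$ is given by 

$$\sigma_{r,s} = \sum_{j=1}^{d}\int_0^1\!\!\int_0^y x(1-y)\bigl(\psi_j(x;r)\psi_j(y;s)+\psi_j(x;s)\psi_j(y;r)\bigr)\,dx\,dy + \sum_{1\le j<k\le d}\int_0^1\!\!\int_0^1 \bigl(G_{j,k}(x,y)-xy\bigr)\bigl(\psi_j(x;r)\psi_k(y;s)+\psi_j(x;s)\psi_k(y;r)\bigr)\,dx\,dy.$$

Proof. Use the Cramér-Wold device and Theorem 1.2. In computing $\sigma_{r,s}$, we used 

$$2\sigma_{r,s} = \lim_{n\to\infty}\bigl(\operatorname{Var}(Z_{n,1,r}+Z_{n,1,s})-\operatorname{Var}(Z_{n,1,r})-\operatorname{Var}(Z_{n,1,s})\bigr),$$

where $Z_{n,1,r} = \frac{1}{n}\sum_{j=1}^{d}\sum_{i=1}^{n} W_{j,1}(i/n)\,\psi_j(i/(n+1);r)$. □ 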
Our results can be adapted to provide a suitable test statistic for testing equality of 
marginal distributions against various alternative hypotheses using suitable choices for $\phi$. 

Remark 1.1. Since the finite-dimensional distributions converge to multivariate normal 
distributions, the weak convergence to a Gaussian process indexed by $t \in T$ ($T$ being an 
interval of $\mathbb{R}$) can be established under a condition such as equicontinuity of $\{\phi_t : t \in T\}$. 

Remark 1.2. In Theorem 1.1, we require only that each component sequence be i.i.d. No further 
assumptions are made on how the components are related. We need a stronger assumption 
in Theorem 1.2, namely, that the rows are i.i.d. random vectors. Interestingly, the variance 
of the limiting normal only depends on the 2-dimensional marginal distributions. 

Remark 1.3. Conditions (C1) and (C2) are, in general, easy to verify. Condition (C3) 
is used to control the behavior of the function $\psi$ around the neighborhood of $(0,\ldots,0)$ 
and $(1,\ldots,1)$ in $(0,1)^d$. For example, if we suppose that $X_1^{(j)}$ is uniformly distributed 
over $(0,1)$ for $j = 1,2$ and $\phi(x,y) := ((x+y)/2)^{-a}(1-(x+y)/2)^{-a}$, then (C3) holds if 
$0 < a < 1/4$. However, the first limit in (C3) fails if $a > 1/4$ and the second limit in (C3) 
fails if $a > 1/2$. 
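
The thresholds $1/4$ and $1/2$ in Remark 1.3 can be traced to the integrability of the two weighted functions appearing in (C3). The computation below is a sketch of that check, not taken from the paper; the auxiliary function $h$ is notation introduced here only for convenience, and uniform marginals are assumed so that $\psi = \phi$.

```latex
\begin{align*}
h(u) &= (u(1-u))^{-a}, \qquad \psi(x,y) = h\bigl(\tfrac{x+y}{2}\bigr),\\
\psi_j(x) &= \tfrac{1}{2}\,h'(x) = \tfrac{a}{2}\,(2x-1)\,(x(1-x))^{-a-1}, \qquad j = 1,2,\\
(x(1-x))^{3/2}\,(\psi_j(x))^2 &= \tfrac{a^2}{4}\,(2x-1)^2\,(x(1-x))^{-2a-1/2},\\
\tilde\psi_{j,k}(x) &= \tfrac{1}{4}\,h''(x)
  = \tfrac{a}{2}\,(x(1-x))^{-a-1} + \tfrac{a(a+1)}{4}\,(2x-1)^2\,(x(1-x))^{-a-2},\\
(x(1-x))^{3/2}\,|\tilde\psi_{j,k}(x)| &= \tfrac{a}{2}\,(x(1-x))^{1/2-a}
  + \tfrac{a(a+1)}{4}\,(2x-1)^2\,(x(1-x))^{-a-1/2}.
\end{align*}
```

Near $x = 0$ and $x = 1$ the first weighted function behaves like a constant multiple of $(x(1-x))^{-2a-1/2}$, which is integrable over $(0,1)$ exactly when $2a + 1/2 < 1$, that is, $a < 1/4$. The second behaves like a constant multiple of $(x(1-x))^{-a-1/2}$, which is integrable exactly when $a < 1/2$.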

Remark 1.4. By a compactness argument, condition (C1) is shown to be equivalent to 
(C1') For any $c \in (0, \tfrac{1}{2})$, $\lim_{\delta\to 0}\omega(c,\delta) = 0$, where 

$$\omega(c,\delta) := \sup\bigl\{|\psi(x_1,\ldots,x_d)-\gamma(y)| : |x_i-y| < \delta,\ c < y,\ x_i < 1-c,\ 1 \le i \le d\bigr\}. \tag{1.9}$$

Proofs of Theorems 1.1 and 1.2 are given in Sections 2 and 3, respectively. The results 
are illustrated by means of examples and counterexamples in the last section. 

2. Proof of Theorem 1.1 

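The main idea of the proof of Theorem 1.1 comes from the observation that 

$$\frac{1}{n}\sum_{i=1}^{n}\phi\bigl(X_{n:i}^{(1)},\ldots,X_{n:i}^{(d)}\bigr) \approx \frac{1}{n}\sum_{i=1}^{n}\gamma\Bigl(\frac{i}{n+1}\Bigr) \approx \int_0^1\gamma(y)\,dy.$$

The cases where $i$ is close to 1 or $n$ need to be carefully analyzed as $\psi$ could be unbounded 
near 0 and 1. 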

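Proof of Theorem 1.1. Let $U_i^{(j)} = F_j(X_i^{(j)})$ for $1 \le i \le n$, $1 \le j \le d$. Therefore, 
$\{U_1^{(j)}, U_2^{(j)}, \ldots\}$ forms a sequence of i.i.d. uniformly distributed random variables and 
$F_j^{-1}(U_i^{(j)}) = X_i^{(j)}$ with probability 1. Recall that $U_{n:i}^{(j)}$ denotes the $i$th order statistic 
of $U_1^{(j)}, \ldots, U_n^{(j)}$. We write $\mu_{n:i} = EU_{n:i}^{(j)} = i/(n+1)$. Recall, also, that $\psi(x_1,\ldots,x_d) = 
\phi(F_1^{-1}(x_1), \ldots, F_d^{-1}(x_d))$ and that $\gamma(x) = \psi(x,\ldots,x)$. For any $\varepsilon \in (0, c_0)$, 

$$\frac{1}{n}\sum_{i=1}^{n}\phi\bigl(X_{n:i}^{(1)},\ldots,X_{n:i}^{(d)}\bigr) = \frac{1}{n}\sum_{i=1}^{n}\psi\bigl(U_{n:i}^{(1)},\ldots,U_{n:i}^{(d)}\bigr) = \Gamma_n + R_{n,1} + R_{n,2} + R_{n,3} \tag{2.1}$$

almost surely, where 

$$\Gamma_n = \frac{1}{n}\sum_{i=1}^{n}\gamma(\mu_{n:i}),$$

$$R_{n,1} = \frac{1}{n}\sum_{1\le i<\varepsilon n}\bigl(\psi(U_{n:i}^{(1)},\ldots,U_{n:i}^{(d)})-\gamma(\mu_{n:i})\bigr),$$

$$R_{n,2} = \frac{1}{n}\sum_{\varepsilon n\le i\le(1-\varepsilon)n}\bigl(\psi(U_{n:i}^{(1)},\ldots,U_{n:i}^{(d)})-\gamma(\mu_{n:i})\bigr),$$

$$R_{n,3} = \frac{1}{n}\sum_{(1-\varepsilon)n<i\le n}\bigl(\psi(U_{n:i}^{(1)},\ldots,U_{n:i}^{(d)})-\gamma(\mu_{n:i})\bigr).$$

Since $\gamma$ is Riemann integrable, the Riemann sum 

$$\Gamma_n \to \int_0^1\gamma(y)\,dy = E\phi\bigl(F_1^{-1}(U),\ldots,F_d^{-1}(U)\bigr) \quad\text{as } n\to\infty.$$

Thus, it remains to show that $R_{n,i} \to 0$ a.s. as $n\to\infty$ for $i = 1, 2$ and 3. 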

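For $1 \le j \le d$, by the Glivenko-Cantelli lemma, $\sup_{x\in(0,1)}|F_{n;j}(x) - x| \to 0$ a.s. as 
$n\to\infty$, where $F_{n;j}$ is the empirical distribution function of $\{U_i^{(j)} : 1 \le i \le n\}$. For 
$1 \le i \le n$ and $1 \le j \le d$, we have 

$$|U_{n:i}^{(j)}-\mu_{n:i}| \le \Bigl|U_{n:i}^{(j)}-\frac{i}{n}\Bigr|+\frac{1}{n} = \bigl|U_{n:i}^{(j)}-F_{n;j}(U_{n:i}^{(j)})\bigr|+\frac{1}{n} \le \frac{1}{n}+\sup_{x\in(0,1)}|x-F_{n;j}(x)|.$$

Hence, it follows that as $n\to\infty$, 

$$\delta_n := \max\bigl\{|U_{n:i}^{(j)}-\mu_{n:i}| : 1\le i\le n,\ 1\le j\le d\bigr\} \xrightarrow{\text{a.s.}} 0. \tag{2.2}$$

Recall the definition of $\omega(c,\delta)$ in (1.9). Since $U_{n:i}^{(j)} \in (\mu_{n:i} - \delta_n, \mu_{n:i} + \delta_n)$ for $1 \le j \le d$ 
and for each integer $i$ in the interval $[n\varepsilon, n(1-\varepsilon)]$, we have $|\psi(U_{n:i}^{(1)}, \ldots, U_{n:i}^{(d)}) - \gamma(\mu_{n:i})| \le 
\omega(\varepsilon, \delta_n)$, provided $\delta_n < \varepsilon/2$. Hence, if $\delta_n < \varepsilon/2$, by (2.2) and (C1') (which is equivalent to 
(C1) by Remark 1.4 in Section 1), we have 

$$|R_{n,2}| \le \frac{1}{n}\sum_{\varepsilon n\le i\le(1-\varepsilon)n}\bigl|\psi(U_{n:i}^{(1)},\ldots,U_{n:i}^{(d)})-\gamma(\mu_{n:i})\bigr| \le \omega(\varepsilon,\delta_n) \xrightarrow{\text{a.s.}} 0$$
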
as $n\to\infty$. By (C2), 

$$|R_{n,1}| \le K\sum_{j=1}^{d}R_{n,1,j}+\frac{1}{n}\sum_{1\le i<\varepsilon n}|\gamma(\mu_{n:i})|+K\varepsilon,$$

where $R_{n,1,j} = \frac{1}{n}\sum_{1\le i<\varepsilon n}|\gamma(U_{n:i}^{(j)})|$ for $1 \le j \le d$. Clearly, if $U_{n:[\varepsilon n]+1}^{(j)} < 2\varepsilon$, then 

$$R_{n,1,j} \le n^{-1}\sum_{1\le i\le n}|\gamma(U_i^{(j)})|\,I(U_i^{(j)} < 2\varepsilon).$$

Note that, with probability 1, $U_{n:[\varepsilon n]+1}^{(j)} < 2\varepsilon$ for all large $n$ and the right-hand side of 
the above inequality goes to $\int_0^{2\varepsilon}|\gamma(y)|\,dy$ a.s. as $n\to\infty$. Hence, 

$$\limsup_{n\to\infty}|R_{n,1}| \le (Kd+1)\Bigl(\varepsilon+\int_0^{2\varepsilon}|\gamma(y)|\,dy\Bigr) \quad\text{a.s.}$$

As $|\gamma|$ is integrable, letting $\varepsilon$ tend to zero, we conclude that $R_{n,1} \to 0$ a.s. A similar 
argument will show that $R_{n,3} \to 0$ a.s. as $n\to\infty$. This completes the proof of 
Theorem 1.1. □ 



3. Proof of Theorem 1.2 



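As in the proof of Theorem 1.1, we introduce $U_i^{(j)} = F_j(X_i^{(j)})$ for $1 \le i \le n$, $1 \le j \le d$. 
It follows that $(U_i^{(1)}, \ldots, U_i^{(d)})$, $1 \le i \le n$, are i.i.d. random vectors. For $1 \le j, k \le d$, note 
that $G_{j,k}$ is the joint distribution of $(U_1^{(j)}, U_1^{(k)})$. In particular, $G_{j,j}(x,y) = \min\{x,y\}$, for 
$1 \le j \le d$. Using the notation introduced in Section 1, we outline some key approximations 
used in the proof of Theorem 1.2. In particular, (1.7) follows from 

$$\frac{1}{\sqrt n}\sum_{i=1}^{n}\psi\bigl(U_{n:i}^{(1)},\ldots,U_{n:i}^{(d)}\bigr)-\sqrt n\,\bar\gamma \approx \frac{1}{\sqrt n}\sum_{i=1}^{n}\bigl(\psi(U_{n:i}^{(1)},\ldots,U_{n:i}^{(d)})-\gamma(\mu_{n:i})\bigr)$$

$$\approx \frac{1}{\sqrt n}\sum_{i=1}^{n}\sum_{j=1}^{d}\bigl(U_{n:i}^{(j)}-\mu_{n:i}\bigr)\psi_j(\mu_{n:i})$$

$$\approx \frac{1}{\sqrt n}\,\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{d}\sum_{\ell=1}^{n}W_{j,\ell}(i/n)\,\psi_j(\mu_{n:i}).$$

The proof of the first approximation, which is about $\sqrt n$ times the difference between 
the Riemann sum and the integral $\bar\gamma$, is non-trivial and is handled in Lemma 3.3. We 
use Bahadur's representation of quantiles in the last approximation. We start with some 
technical lemmas, the first of which is well known (see [6], page 36). 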

Lemma 3.1. Suppose that $U_{n:1} \le \cdots \le U_{n:n}$ denote the order statistics of $n$ independent 
random variables that are uniformly distributed over $(0,1)$. Then, for $1 \le i \le n$, 

$$\operatorname{Var}(U_{n:i}) = \frac{\mu_{n:i}(1-\mu_{n:i})}{n+2} \le \frac{1}{n}.$$
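
The identity and the bound in Lemma 3.1 can be verified exactly for small $n$. The sketch below relies on the standard fact that the $i$th uniform order statistic has a Beta$(i, n-i+1)$ distribution, whose first two moments are $i/(n+1)$ and $i(i+1)/((n+1)(n+2))$; the function names are illustrative assumptions.

```python
from fractions import Fraction


def uniform_order_stat_variance(n, i):
    """Exact Var(U_{n:i}) from the first two moments of Beta(i, n - i + 1)."""
    first = Fraction(i, n + 1)                         # E[U_{n:i}]
    second = Fraction(i * (i + 1), (n + 1) * (n + 2))  # E[U_{n:i}^2]
    return second - first ** 2


def lemma_formula(n, i):
    """mu * (1 - mu) / (n + 2) with mu = i / (n + 1)."""
    mu = Fraction(i, n + 1)
    return mu * (1 - mu) / (n + 2)


# The chained comparison checks the equality and the bound 1/n together.
checks = [
    uniform_order_stat_variance(n, i) == lemma_formula(n, i) <= Fraction(1, n)
    for n in range(1, 40)
    for i in range(1, n + 1)
]
print(all(checks))  # True
```

Exact rational arithmetic is used so that the comparison is not affected by rounding.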

Lemma 3.2. Under condition (C3), the limiting variance $\sigma^2$ is well defined. 

Proof. It suffices to show that for $1 \le j, k \le d$, 

$$\beta_1 := \iint_{0<x<y<1}|G_{j,k}(x,y)-xy|\,|\psi_j(x)\psi_k(y)|\,dx\,dy < \infty, \tag{3.1}$$

$$\beta_2 := \iint_{0<y<x<1}|G_{j,k}(x,y)-xy|\,|\psi_j(x)\psi_k(y)|\,dx\,dy < \infty. \tag{3.2}$$

To prove (3.1), we introduce $W_j(x) := I(U_1^{(j)} \le x) - x$. Here, $W_j(x)$ has mean 0 and 
variance $x(1-x)$. Furthermore, $EW_j(x)W_k(y) = G_{j,k}(x,y) - xy$. Thus, $EW_j(x)W_j(y) = 
x(1-y)$ when $x < y$. By the Cauchy-Schwarz inequality, $\beta_1^2$ is bounded above by 

$$\Bigl(E\int_0^1\!\!\int_0^y \bigl(x/(1-y)\bigr)^{1/4}|\psi_j(x)|\,|W_j(x)|\,\bigl((1-y)/x\bigr)^{1/4}|\psi_k(y)|\,|W_k(y)|\,dx\,dy\Bigr)^2$$

$$\le E\int_0^1\!\!\int_0^y \bigl(x/(1-y)\bigr)^{1/2}(\psi_j(x))^2(W_j(x))^2\,dx\,dy \times E\int_0^1\!\!\int_0^y \bigl((1-y)/x\bigr)^{1/2}(\psi_k(y))^2(W_k(y))^2\,dx\,dy$$

$$= \int_0^1\!\!\int_0^y x^{3/2}(1-x)(1-y)^{-1/2}(\psi_j(x))^2\,dx\,dy \times \int_0^1\!\!\int_0^y x^{-1/2}\,y\,(1-y)^{3/2}(\psi_k(y))^2\,dx\,dy$$

$$= 4\int_0^1 x^{3/2}(1-x)^{3/2}(\psi_j(x))^2\,dx \times \int_0^1 y^{3/2}(1-y)^{3/2}(\psi_k(y))^2\,dy < \infty.$$

Similarly, we can prove (3.2). This completes the proof of Lemma 3.2. □ 
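Lemma 3.3. Let $\phi : (0,1)^d \to \mathbb{R}$ satisfy condition (C3). Suppose that the function $\gamma$ 
associated with $\phi$ and defined in (1.3) is Riemann integrable. We then have 

$$\frac{1}{\sqrt n}\sum_{i=1}^{n}\gamma(\mu_{n:i})-\sqrt n\int_0^1\gamma(x)\,dx \to 0$$

as $n\to\infty$. 

Proof. As $\gamma'(x) = \psi_1(x) + \cdots + \psi_d(x)$, condition (C3) implies that $(x(1-x))^{3/2}(\gamma'(x))^2$ 
is Riemann integrable. We have 

$$\frac{1}{\sqrt n}\sum_{i=1}^{n}\gamma(\mu_{n:i})-\sqrt n\int_0^1\gamma(x)\,dx = \sqrt n\sum_{i=1}^{n}\int_{(i-1)/n}^{i/n}\bigl(\gamma(\mu_{n:i})-\gamma(x)\bigr)\,dx$$

$$= \sqrt n\sum_{i=1}^{n}\Bigl(\int_{(i-1)/n}^{\mu_{n:i}}\!\int_x^{\mu_{n:i}}\gamma'(y)\,dy\,dx-\int_{\mu_{n:i}}^{i/n}\!\int_{\mu_{n:i}}^{x}\gamma'(y)\,dy\,dx\Bigr)$$

$$= \sqrt n\int_0^1 g_n(y)\,\gamma'(y)\,dy,$$

where 

$$g_n(y) = \begin{cases} y-(i-1)/n, & \text{if } (i-1)/n < y \le i/(n+1),\ 1\le i\le n,\\ y-i/n, & \text{if } i/(n+1) < y \le i/n,\ 1\le i\le n.\end{cases}$$

Note that 

$$|g_n(y)| \le \begin{cases} y, & \text{if } 0 < y \le 1/n,\\ 1/n, & \text{if } 1/n < y \le 1-1/n,\\ 1-y, & \text{if } 1-1/n < y < 1.\end{cases}$$

Therefore, 

$$\Bigl(\frac{1}{\sqrt n}\sum_{i=1}^{n}\gamma(\mu_{n:i})-\sqrt n\int_0^1\gamma(x)\,dx\Bigr)^2 \le n\int_0^1 (g_n(y))^2\,(y(1-y))^{-3/2}\,dy\int_0^1 (y(1-y))^{3/2}(\gamma'(y))^2\,dy.$$

Since the second term above is finite by (C3), Lemma 3.3 will follow if we can show that 
the first term goes to 0 as $n\to\infty$. Note that 

$$n\int_0^1 (g_n(y))^2\,(y(1-y))^{-3/2}\,dy \le 2^{3/2}\,n\Bigl(\int_0^{1/n}\sqrt{y}\,dy+\int_{1-1/n}^{1}\sqrt{1-y}\,dy\Bigr)+\frac{1}{n}\int_{1/n}^{1-1/n}y^{-3/2}(1-y)^{-3/2}\,dy$$

$$\le \frac{2^{7/2}}{3\sqrt n}+(1-n^{-1})^{-3/4}\,n^{-1/4}\int_0^1 y^{-3/4}(1-y)^{-3/4}\,dy \to 0. \qquad\Box$$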

Lemma 3.4. Let $U_{n:i}$ denote the $i$th order statistic of an i.i.d. sample of size $n$ from 
the uniform distribution over $(0,1)$. Define $A_{m,n} = \bigcap_{1\le i\le n}\{U_{n:i}(1-U_{n:i}) > \mu_{n:i}(1-\mu_{n:i})/m\}$. 
We then have $\lim_{m\to\infty}P(A_{m,n}) = 1$. 

Proof. By symmetry considerations, we only need to prove 

$$\lim_{m\to\infty}\sup_{n\ge1}P\Bigl(\bigcap_{1\le i\le n/2}\bigl\{U_{n:i}(1-U_{n:i}) > \mu_{n:i}(1-\mu_{n:i})/m\bigr\}\Bigr) = 1. \tag{3.3}$$

For any $\varepsilon > 0$, we can choose $n_0$ such that for all $n > n_0$, $P(U_{n:[(n+1)/2]} > 2/3) < \varepsilon/2$ 
and 

$$P\Bigl(\bigcap_{1\le i\le n/2}\bigl\{U_{n:i}(1-U_{n:i}) > \mu_{n:i}(1-\mu_{n:i})/m\bigr\}\Bigr) \ge P\Bigl(\bigcap_{1\le i\le n/2}\{U_{n:i} > 3\mu_{n:i}/m\}\Bigr)-P\bigl(\{U_{n:[(n+1)/2]} > 2/3\}\bigr)$$

$$\ge P\Bigl(\bigcap_{1\le i\le n/2}\{U_{n:i} > 3\mu_{n:i}/m\}\Bigr)-\varepsilon/2.$$

Obviously, we can find a constant $m_0$ such that for all $m > m_0$, 

$$\sup_{1\le n\le n_0}P\Bigl(\bigcap_{1\le i\le n/2}\bigl\{U_{n:i}(1-U_{n:i}) > \mu_{n:i}(1-\mu_{n:i})/m\bigr\}\Bigr) > 1-\varepsilon.$$

If we can choose a constant $m_1$ such that for all $m > m_1$, 

$$\sup_{n>n_0}P\Bigl(\bigcap_{1\le i\le n/2}\{U_{n:i} > 3\mu_{n:i}/m\}\Bigr) > 1-\varepsilon/2,$$

then, for all $m > \max(m_0, m_1)$, 

$$\sup_{n\ge1}P\Bigl(\bigcap_{1\le i\le n/2}\bigl\{U_{n:i}(1-U_{n:i}) > \mu_{n:i}(1-\mu_{n:i})/m\bigr\}\Bigr) > 1-\varepsilon.$$

Therefore, the proof of Lemma 3.4 reduces to establishing that 

$$\lim_{m\to\infty}\sup_{n\ge1}P\Bigl(\bigcap_{1\le i\le n/2}\{U_{n:i} > \mu_{n:i}/m\}\Bigr) = 1. \tag{3.4}$$

Recall the representation formula for the order statistics from a sequence of uniform 
random variables, $U_{n:i} \stackrel{d}{=} S_i/S_{n+1}$, where $e_1, \ldots, e_{n+1}$ are i.i.d. exponentially distributed 
random variables with $E(e_1) = 1$ and $S_i = e_1 + \cdots + e_i$. If 

$$M := \inf_{1\le i\le n<\infty}\frac{S_i/i}{S_{n+1}/(n+1)} > \frac{1}{m},$$

then, for all $1 \le i \le n/2$, we have $\frac{S_i/i}{S_{n+1}/(n+1)} > \frac{1}{m}$. This, in turn, implies that, as 
$m\to\infty$, 

$$\lim_{m\to\infty}\sup_{n\ge1}P\Bigl(\bigcap_{1\le i\le n/2}\Bigl\{\frac{S_i/i}{S_{n+1}/(n+1)} > \frac{1}{m}\Bigr\}\Bigr) \ge \lim_{m\to\infty}P(M > 1/m) = P(M > 0).$$

Since $S_n/n \to 1$ a.s. as $n\to\infty$, we have $P(M > 0) = 1$. This implies (3.4) and hence 
Lemma 3.4 follows. □ 
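
The representation of uniform order statistics through partial sums of unit exponentials, used in the proof above, is easy to probe by simulation. The sketch below compares the Monte Carlo means of $S_i/S_{n+1}$ with $\mu_{n:i} = i/(n+1)$; the sample size, the number of replications, the seed and the helper name are assumptions chosen for illustration only.

```python
import random


def order_stats_via_spacings(n, rng):
    """Sample (S_1/S_{n+1}, ..., S_n/S_{n+1}) from n + 1 unit exponentials."""
    partial_sums = []
    total = 0.0
    for _ in range(n + 1):
        total += rng.expovariate(1.0)  # exponential with mean 1
        partial_sums.append(total)
    # total now equals S_{n+1}; keep the first n normalized partial sums.
    return [s / total for s in partial_sums[:n]]


rng = random.Random(2024)  # fixed seed, for reproducibility only
n, reps = 5, 20000
means = [0.0] * n
for _ in range(reps):
    for i, value in enumerate(order_stats_via_spacings(n, rng)):
        means[i] += value / reps

expected = [(i + 1) / (n + 1) for i in range(n)]
print([round(m, 3) for m in means])
print([round(e, 3) for e in expected])
```

Agreement of the means is of course only a necessary check of the distributional identity, not a proof of it.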

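Proof of Theorem 1.2. We write 

$$\frac{1}{\sqrt n}\sum_{i=1}^{n}\psi\bigl(U_{n:i}^{(1)},\ldots,U_{n:i}^{(d)}\bigr)-\sqrt n\,\bar\gamma = S_{n,1}+S_{n,2}+\varepsilon_n,$$

where 

$$S_{n,1} = n^{-1/2}\sum_{j=1}^{d}\sum_{i=1}^{n}\bigl(U_{n:i}^{(j)}-\mu_{n:i}\bigr)\psi_j(\mu_{n:i}),$$

$$S_{n,2} = n^{-1/2}\sum_{i=1}^{n}\bigl(\psi(U_{n:i}^{(1)},\ldots,U_{n:i}^{(d)})-\gamma(\mu_{n:i})\bigr)-S_{n,1},$$

$$\varepsilon_n = n^{-1/2}\sum_{i=1}^{n}\gamma(\mu_{n:i})-\sqrt n\int_0^1\gamma(x)\,dx.$$

By Lemma 3.3, $\varepsilon_n \to 0$ as $n\to\infty$. We shall now show that $S_{n,2} \to_P 0$ as $n\to\infty$. 
Since $\max\{|U_{n:i}^{(j)} - \mu_{n:i}| : 1 \le i \le n,\ 1 \le j \le d\} \to 0$ a.s., by (C4) we have 

$$|S_{n,2}|\,I_{A_{m,n}} \le \frac{K}{\sqrt n}\sum_{j,k=1}^{d}\sum_{i=1}^{n}\bigl|(U_{n:i}^{(j)}-\mu_{n:i})(U_{n:i}^{(k)}-\mu_{n:i})\bigr|\bigl(1+|\tilde\psi_{j,k}(\mu_{n:i})|\bigr).$$

By condition (C3), Lemma 3.1 and the Cauchy-Schwarz inequality, we obtain 

$$\frac{1}{\sqrt n}\sum_{i=1}^{n}E\bigl|(U_{n:i}^{(j)}-\mu_{n:i})(U_{n:i}^{(k)}-\mu_{n:i})\bigr|\bigl(1+|\tilde\psi_{j,k}(\mu_{n:i})|\bigr) \le \frac{1}{n^{3/2}}\sum_{i=1}^{n}\mu_{n:i}(1-\mu_{n:i})|\tilde\psi_{j,k}(\mu_{n:i})|+\frac{1}{\sqrt n} = J_1+J_2+J_3+\frac{1}{\sqrt n},$$

where $J_1 = n^{-3/2}\sum_{1\le i\le\sqrt n}\mu_{n:i}(1-\mu_{n:i})|\tilde\psi_{j,k}(\mu_{n:i})|$ and $J_2$, $J_3$ are similarly defined 
over $\sqrt n < i \le n - \sqrt n$ and $n - \sqrt n < i \le n$, respectively. We have 

$$J_1 \le \frac{2}{n}\sum_{1\le i\le\sqrt n}\bigl(\mu_{n:i}(1-\mu_{n:i})\bigr)^{3/2}|\tilde\psi_{j,k}(\mu_{n:i})| \approx 2\int_0^{1/\sqrt n}(x(1-x))^{3/2}|\tilde\psi_{j,k}(x)|\,dx \to 0$$

as $n\to\infty$. Similarly, $J_3 \to 0$ as $n\to\infty$. Also, as $n\to\infty$, 

$$J_2 \le \frac{2}{n^{5/4}}\sum_{\sqrt n<i\le n-\sqrt n}\bigl(\mu_{n:i}(1-\mu_{n:i})\bigr)^{3/2}|\tilde\psi_{j,k}(\mu_{n:i})| \to 0.$$

That is, we have shown that as $n\to\infty$, for any given large $m$, $S_{n,2}I_{A_{m,n}} \to_P 0$. We can 
now choose a sequence of $m = m_n \to \infty$ such that $S_{n,2}I_{A_{m_n,n}} \to_P 0$ as $n\to\infty$. By Lemma 
3.4, $I_{A_{m_n,n}^c} \to_P 0$ and hence $S_{n,2}I_{A_{m_n,n}^c} \to_P 0$ as $m\to\infty$. Therefore, $S_{n,2} \to_P 0$. 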

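Define $W_{j,\ell}(x) = I(U_\ell^{(j)} \le x) - x$ for $1 \le j \le d$ and $1 \le \ell \le n$. Observe that $W_{j,1}$ is $W_j$ 
defined in the proof of Lemma 3.2 and that $F_{n;j}^{-1}(i/n) = U_{n:i}^{(j)}$. By Bahadur's representation 
of quantiles (see, e.g., [1] or [9]), 

$$\sup_{0<t<1}\bigl|F_{n;j}(t)-t+F_{n;j}^{-1}(t)-t\bigr| = O(n^{-3/4}\log n) \quad\text{a.s. for } 1\le j\le d.$$

Hence, 

$$S_{n,1} = \frac{1}{\sqrt n}\sum_{\ell=1}^{n}Z_{n,\ell}+o(1) \quad\text{a.s.},$$

where, for each $n$, 

$$Z_{n,\ell} = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{d}W_{j,\ell}(i/n)\,\psi_j(\mu_{n:i})$$

are i.i.d. random variables with mean zero and 

$$\operatorname{Var}(Z_{n,1}) = \sum_{j,k=1}^{d}\frac{1}{n^2}\sum_{h,i=1}^{n}\operatorname{Cov}\bigl(W_j(h/n),W_k(i/n)\bigr)\psi_j(\mu_{n:h})\psi_k(\mu_{n:i})$$

$$= \sum_{j,k=1}^{d}\frac{1}{n^2}\sum_{h,i=1}^{n}\bigl(G_{j,k}(h/n,i/n)-hi\,n^{-2}\bigr)\psi_j(\mu_{n:h})\psi_k(\mu_{n:i})$$

$$\to \sum_{j,k=1}^{d}\int_0^1\!\!\int_0^1\bigl(G_{j,k}(x,y)-xy\bigr)\psi_j(x)\psi_k(y)\,dx\,dy.$$

Recall that $G_{j,k}$ is the joint distribution of $(U_1^{(j)}, U_1^{(k)})$ and that $G_{j,j}(x,y) = \min(x,y)$. 
To establish the convergence above, fix $j$, $k$ and split the second sum above into cases 
according to whether $h$, or $i$, is: less than $\varepsilon n$; between $\varepsilon n$ and $(1-\varepsilon)n$; greater than 
$(1-\varepsilon)n$. For example, when we sum over $\varepsilon n \le h, i \le (1-\varepsilon)n$, then it converges to 
$\int_\varepsilon^{1-\varepsilon}\int_\varepsilon^{1-\varepsilon}H(x,y)\,dx\,dy$, where $H(x,y) = (G_{j,k}(x,y) - xy)\psi_j(x)\psi_k(y)$. The sum over 
$1 \le h < \varepsilon n$ and $\varepsilon n \le i \le (1-\varepsilon)n$ can be shown to converge to $\int_\varepsilon^{1-\varepsilon}\int_0^{\varepsilon}H(x,y)\,dx\,dy$, 
which, from the method of proof of Lemma 3.2 and condition (C3), can be shown to 
converge to 0 as $\varepsilon \to 0$. Similar convergences hold for other ranges of $h$ and $i$. 

It is now easy to see that the limit above can be written in the form of $\sigma^2$ as stated in 
Theorem 1.2. Note that $|Z_{n,1}| \le \sum_{j=1}^{d}\frac{1}{n}\sum_{i=1}^{n}|\psi_j(\mu_{n:i})|$. If $\frac{1}{n\sqrt n}\sum_{i=1}^{n}|\psi_j(\mu_{n:i})| \to 0$ 
for $j = 1, 2, \ldots, d$, then the Lindeberg-Lévy condition holds. To see this, note that 

$$\Bigl(\frac{1}{n\sqrt n}\sum_{i=1}^{n}|\psi_j(\mu_{n:i})|\Bigr)^2 \le \Bigl(\frac{1}{n}\sum_{i=1}^{n}\bigl(\mu_{n:i}(1-\mu_{n:i})\bigr)^{3/2}(\psi_j(\mu_{n:i}))^2\Bigr)\Bigl(\frac{1}{n^2}\sum_{i=1}^{n}\bigl(\mu_{n:i}(1-\mu_{n:i})\bigr)^{-3/2}\Bigr).$$

By (C3), it is enough to establish that $I_n = \frac{1}{n^2}\sum_{i=1}^{n}\bigl(\frac{i}{n+1}(1-\frac{i}{n+1})\bigr)^{-3/2} \to 0$. Since 

$$I_n \le \frac{2^{3/2}}{n^2}\Bigl(\sum_{1\le i\le(n+1)/2}\mu_{n:i}^{-3/2}+\sum_{(n+1)/2<i\le n}(1-\mu_{n:i})^{-3/2}\Bigr) = O(n^{-1/2}),$$

we have, by the Lindeberg-Lévy central limit theorem, 

$$\frac{1}{\sqrt n}\sum_{\ell=1}^{n}Z_{n,\ell} \xrightarrow{\text{dist}} N(0,\sigma^2).$$

Hence, $S_{n,1} \to_{\text{dist}} N(0,\sigma^2)$, which completes the proof of Theorem 1.2. □ 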
4. Examples and counterexamples 

We give some examples to show our results and counterexamples to illustrate that 
conditions (C1) and (C2) are necessary for Theorem 1.1 to hold. 

(1) Let $Z$ be a random variable with a continuous distribution function $F$. Let 
$g_j$, $1 \le j \le d$, be continuous monotonically increasing functions. For each $1 \le j \le d$, 
suppose $X_1^{(j)}, X_2^{(j)}, \ldots$ are independent random variables having the same 
distribution as $g_j(Z)$. Applying Theorem 1.1 and assuming necessary integrability 
conditions, we get, after changing the variable $y = F(x)$, 

$$\frac{1}{n}\sum_{i=1}^{n}\phi\bigl(X_{n:i}^{(1)},\ldots,X_{n:i}^{(d)}\bigr) \xrightarrow{\text{a.s.}} E\phi\bigl(g_1(Z),\ldots,g_d(Z)\bigr)$$

as $n\to\infty$. 



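(2) Let $(X_1^{(1)}, \ldots, X_1^{(d)}), (X_2^{(1)}, \ldots, X_2^{(d)}), \ldots$ be independent random vectors having 
the same distribution as $(U_1, \ldots, U_d)$, where the $U_j$'s are uniformly distributed 
over $(0,1)$. Let $F_{j,k}$ be the joint distribution of $U_j$ and $U_k$. Suppose $\phi : (0,1)^d \to \mathbb{R}$ 
is defined by $\phi(x_1, \ldots, x_d) = x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_d^{\alpha_d}$, where $\alpha_j \ge 1$. Let $M = \alpha_1 + \cdots + \alpha_d$. 
Then $\psi = \phi$, $\gamma(x) = x^M$ and $\psi_j(x) = \alpha_j x^{M-1}$ for $1 \le j \le d$. We have 

$$\frac{1}{n}\sum_{i=1}^{n}\bigl(X_{n:i}^{(1)}\bigr)^{\alpha_1}\cdots\bigl(X_{n:i}^{(d)}\bigr)^{\alpha_d} \to \frac{1}{M+1} \quad\text{a.s.}$$

and 

$$\frac{1}{\sqrt n}\sum_{i=1}^{n}\bigl(X_{n:i}^{(1)}\bigr)^{\alpha_1}\cdots\bigl(X_{n:i}^{(d)}\bigr)^{\alpha_d}-\frac{\sqrt n}{M+1} \xrightarrow{\text{dist}} N(0,\sigma^2),$$

where 

$$\sigma^2 = 2\sum_{1\le j<k\le d}\alpha_j\alpha_k\int_0^1\!\!\int_0^1\bigl(F_{j,k}(x,y)-xy\bigr)(xy)^{M-1}\,dx\,dy+\frac{1}{(M+1)^2(2M+1)}\sum_{j=1}^{d}\alpha_j^2.$$
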

(3) The study of the statistical properties when there is a loss of association among 
paired data has attracted a lot of attention in various contexts, such as the broken 
sample problem, file linkage problem and record linkage. For example, DeGroot 
and Goel initiated the investigation of estimating the correlation coefficient of a 
bivariate normal distribution based on a broken random sample in [7]. Copas and 
Hilton proposed statistical models to measure the evidence that a pair of records 
relates to the same individuals in [4]. Chan and Loh considered an approximation 
of the likelihood computation for large broken sample in [5]. Bai and Hsing, in 
[2], proved that there does not exist any consistent discrimination rule for the 
correlation coefficient, $\rho$, between $X$ and $Y$ when the paired sample is broken, 
that is, the association between $X$ and $Y$ is lost. When pairing is lost, the $X$'s and 
$Y$'s behave as if they were independent as far as first order asymptotics, such as 
the law of large numbers (see Theorem 1.1), are concerned. 
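
For item (2), the variance formula of Theorem 1.2 can be evaluated numerically in a simple special case. The sketch below assumes, as an extra simplification not made in the text, that the coordinates $U_1, \ldots, U_d$ are independent, so that $F_{j,k}(x,y) = xy$ and the cross term of $\sigma^2$ vanishes. It compares a midpoint-rule value of $2\sum_j\int_0^1\int_0^y x(1-y)\psi_j(x)\psi_j(y)\,dx\,dy$ with the closed form $\sum_j\alpha_j^2/((M+1)^2(2M+1))$; the function names and grid size are illustrative.

```python
def sigma_squared_closed_form(alphas):
    """sum_j alpha_j^2 / ((M + 1)^2 (2M + 1)) with M = sum of the alphas."""
    m = sum(alphas)
    return sum(a * a for a in alphas) / ((m + 1) ** 2 * (2 * m + 1))


def sigma_squared_numeric(alphas, grid=400):
    """Midpoint rule for 2 * sum_j int_0^1 int_0^y x(1-y) psi_j(x) psi_j(y) dx dy.

    Uses psi_j(x) = alpha_j * x**(M - 1), so the sum over j factors out.
    """
    m = sum(alphas)
    h = 1.0 / grid
    pts = [(i + 0.5) * h for i in range(grid)]
    total = 0.0
    for iy, y in enumerate(pts):
        for ix in range(iy + 1):
            x = pts[ix]
            weight = 0.5 if ix == iy else 1.0  # diagonal cells are half inside x < y
            total += weight * x * (1.0 - y) * (x * y) ** (m - 1)
    return 2.0 * sum(a * a for a in alphas) * total * h * h


alphas = (1, 1)  # phi(x, y) = x * y, so M = 2 and gamma-bar = 1/3
print(sigma_squared_closed_form(alphas))
print(sigma_squared_numeric(alphas))
```

For `alphas = (1, 1)` the closed form equals $2/45$, and the numerical value should agree with it up to the discretization error of the grid.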

Example 1. This example shows that condition (C1) is necessary for Theorem 1.1 to 
hold. Let 

$$\phi(x,y) = \begin{cases}1, & \text{if } 0 < x = y < 1,\\ 0, & \text{if } 0 < x \ne y < 1.\end{cases}$$

Let $\{(X_i, Y_i) : 1 \le i \le n\}$ be a sequence of i.i.d. random vectors. We further suppose that 
$X_i$ and $Y_i$ are independent and uniformly distributed over $(0,1)$. Since $\phi$ is bounded, 
(C2) holds, whereas (C1) does not hold. We further note that $P(X_{n:i} \ne Y_{n:i}) = 1$ for 
$1 \le i \le n$. Hence, $\sum_{i=1}^{n}\phi(X_{n:i}, Y_{n:i}) = 0$ a.s., but $\int_0^1\phi(x,x)\,dx = 1$. 

Example 2. This example shows that condition (C2) is necessary for Theorem 1.1 to 
hold. Let $S_0 = (0,1)^2$ and, for $m \ge 1$, define $S_m = (\frac{m}{m+1}, 1)^2$, $\bar S_m = S_{m-1} \setminus S_m$. Let $L_m$ 
be the union of three line segments: 

$$\Bigl\{\Bigl(\frac{m}{m+1},y\Bigr) : \frac{m}{m+1}\le y\le 1\Bigr\}\cup\Bigl\{\Bigl(x,\frac{m}{m+1}\Bigr) : \frac{m}{m+1}\le x\le 1\Bigr\}\cup\Bigl\{(x,x) : 0 < x \le \frac{m}{m+1}\Bigr\}.$$

Let $C_m$ be the region inside $\bar S_m$ which is within distance $\varepsilon_m$ of $L_m$, where $\varepsilon_m$ is chosen 
so that the area of $C_m$ is $m^{-8}$. Write $A_m = \bar S_m \setminus C_m$. Let $\phi$ be a continuous function on $(0,1)^2$ 
satisfying $\phi = 1$ on the diagonal, $\phi = m^3$ on $A_m$ and $1 \le \phi \le m^3$ on $C_m$. 

Let $\{U_i, V_j : 1 \le i, j \le n\}$ be independent and uniformly distributed on $(0,1)$. Define 
$W_n = (U_{n:n}, V_{n:n})$ and $a_n = \lceil n^{1/2}\rceil$. Observe that 

$$\frac{1}{n}\sum_{i=1}^{n}\phi(U_{n:i},V_{n:i}) \ge \frac{1}{n}\phi(W_n) \ge \frac{1}{n}\sum_{m>\sqrt n}m^3\,I(W_n\in A_m) \ge \sqrt n\;I\Bigl(W_n\in\bigcup_{m>\sqrt n}A_m\Bigr). \tag{4.1}$$

We now claim that 

$$I(W_n\in S_{a_n}) \to 1 \quad\text{a.s.}, \tag{4.2}$$

$$I\Bigl(W_n\in\bigcup_{m>\sqrt n}C_m\Bigr) \to 0 \quad\text{a.s.} \tag{4.3}$$

as $n\to\infty$. To prove (4.2), observe that 

$$P(W_n\notin S_{a_n}) \le 2P\bigl(U_{n:n}\le a_n/(1+a_n)\bigr) = 2\Bigl(1-\frac{1}{1+a_n}\Bigr)^n \approx 2e^{-\sqrt n}.$$

This yields 

$$\sum_{n=1}^{\infty}P(W_n\notin S_{a_n}) < \infty,$$

which, by the Borel-Cantelli lemma, implies that $P(W_n \notin S_{a_n} \text{ i.o.}) = 0$, proving (4.2). To 
prove (4.3), it suffices to show that 

$$\sum_{n=1}^{\infty}P\Bigl(W_n\in\bigcup_{m>\sqrt n}C_m\Bigr) < \infty. \tag{4.4}$$

We again consider the $n$th term in the series in (4.4): 

$$P\Bigl(W_n\in\bigcup_{m>\sqrt n}C_m\Bigr) \le \sum_{m>\sqrt n}P(W_n\in C_m) \le \sum_{m>\sqrt n}P\bigl((U_i,V_j)\in C_m \text{ for some } 1\le i,j\le n\bigr)$$

$$\le n^2\sum_{m>\sqrt n}P\bigl((U_1,V_1)\in C_m\bigr) = n^2\sum_{m>\sqrt n}m^{-8} \le Cn^{-3/2}$$

and hence the infinite series in (4.4) is finite. This completes the proof of (4.3). Thus, by 
(4.1), $\frac{1}{n}\sum_{i=1}^{n}\phi(U_{n:i}, V_{n:i})$ diverges. Furthermore, it is easy to see that condition (C2) 
does not hold. If (C2) were satisfied, that would imply boundedness of $\psi$ over $(1-c_0, 1)^2$, 
which is not the case. This completes the construction of the counterexample. 